Suvrit Sra

92 publications

18 venues

H Index 36

Affiliation

Massachusetts Institute of Technology (MIT), Laboratory for Information and Decision Systems, Cambridge, MA, USA
Max Planck Institute for Biological Cybernetics, Tübingen, Germany
University of Texas at Austin, Department of Computer Sciences, Austin, TX, USA

Name	Venue	Year	Citations
On the Training Instability of Shuffling SGD with Batch Normalization.	ICML	2023	0
Global optimality for Euclidean CCCP under Riemannian convexity.	ICML	2023	0
The Crucial Role of Normalization in Sharpness-Aware Minimization.	NIPS/NeurIPS	2023	0
Transformers learn to implement preconditioned gradient descent for in-context learning.	NIPS/NeurIPS	2023	0
Sign and Basis Invariant Networks for Spectral Graph Representation Learning.	ICLR	2023	0
CCCP is Frank-Wolfe in disguise.	NIPS/NeurIPS	2022	1
Understanding the unstable convergence of gradient descent.	ICML	2022	16
Max-Margin Contrastive Learning.	AAAI	2022	0
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity.	ICML	2022	0
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective.	ICML	2022	0
Efficient Sampling on Riemannian Manifolds via Langevin MCMC.	NIPS/NeurIPS	2022	0
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond.	ICLR	2022	0
Understanding Riemannian Acceleration via a Proximal Extragradient Framework.	COLT	2022	0
Can contrastive learning avoid shortcut solutions?	NIPS/NeurIPS	2021	44
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates.	NIPS/NeurIPS	2021	2
Provably Efficient Algorithms for Multi-Objective Competitive RL.	ICML	2021	13
Three Operator Splitting with a Nonconvex Loss Function.	ICML	2021	4
Online Learning in Unknown Markov Games.	ICML	2021	30
Coping with Label Shift via Distributionally Robust Optimisation.	ICLR	2021	0
Contrastive Learning with Hard Negative Samples.	ICLR	2021	0
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?	COLT	2021	0
SGD with shuffling: optimal rates without component convexity and large epoch requirements.	NIPS/NeurIPS	2020	26
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes.	NIPS/NeurIPS	2020	21
Why are Adaptive Methods Good for Attention Models?	NIPS/NeurIPS	2020	91
Strength from Weakness: Fast Learning Using Weak Supervision.	ICML	2020	23
From Nesterov's Estimate Sequence to Riemannian Acceleration.	COLT	2020	50
Geodesically-convex optimization for averaging partially observed covariance matrices.	ACML	2020	2
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions.	ICML	2020	19
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition.	ICML	2020	62
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity.	ICLR	2020	0
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator.	ICML	2019	37
Flexible Modeling of Diversity with Strongly Log-Concave Distributions.	NIPS/NeurIPS	2019	10
Escaping Saddle Points with Adaptive Gradient Methods.	ICML	2019	59
Are deep ResNets provably better than linear predictors?	NIPS/NeurIPS	2019	11
Random Shuffling Beats SGD after Finite Epochs.	ICML	2019	0
Learning Determinantal Point Processes by Corrective Negative Sampling.	AISTATS	2019	0
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity.	NIPS/NeurIPS	2019	0
Direct Runge-Kutta Discretization Achieves Acceleration.	NIPS/NeurIPS	2018	92
Exponentiated Strongly Rayleigh Distributions.	NIPS/NeurIPS	2018	12
An Estimate Sequence for Geodesically Convex Optimization.	COLT	2018	43
Non-Linear Temporal Subspace Representations for Activity Recognition.	CVPR	2018	39
Modular Proximal Optimization for Multidimensional Total-Variation Regularization.	JMLR	2018	0
A Generic Approach for Escaping Saddle points.	AISTATS	2018	0
Elementary Symmetric Polynomials for Optimal Experimental Design.	NIPS/NeurIPS	2017	18
Polynomial time algorithms for dual volume sampling.	NIPS/NeurIPS	2017	31
Combinatorial Topic Models using Small-Variance Asymptotics.	AISTATS	2017	0
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization.	NIPS/NeurIPS	2016	155
Kronecker Determinantal Point Processes.	NIPS/NeurIPS	2016	26
Stochastic Variance Reduction for Nonconvex Optimization.	ICML	2016	503
First-order Methods for Geodesically Convex Optimization.	COLT	2016	205
Geometric Mean Metric Learning.	ICML	2016	136
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds.	NIPS/NeurIPS	2016	169
Fast DPP Sampling for Nyström with Application to Kernel Methods.	ICML	2016	73
AdaDelay: Delay Adaptive Distributed Stochastic Optimization.	AISTATS	2016	31
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling.	NIPS/NeurIPS	2016	33
Gaussian quadrature for matrix inverse forms with applications.	ICML	2016	0
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms.	ICML	2016	0
Efficient Sampling for k-Determinantal Point Processes.	AISTATS	2016	0
Fixed-point algorithms for learning determinantal point processes.	ICML	2015	44
Matrix Manifold Optimization for Gaussian Mixtures.	NIPS/NeurIPS	2015	73
Data modeling with the elliptical gamma distribution.	AISTATS	2015	6
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants.	NIPS/NeurIPS	2015	191
Large-scale randomized-coordinate descent methods with non-separable linear constraints.	UAI	2015	0
Efficient Structured Matrix Rank Minimization.	NIPS/NeurIPS	2014	19
Towards an optimal stochastic alternating direction method of multipliers.	ICML	2014	59
Randomized Nonlinear Component Analysis.	ICML	2014	165
Riemannian Sparse Coding for Positive Definite Matrices.	ECCV	2014	55
Fast Newton methods for the group fused lasso.	UAI	2014	18
Geometric optimisation on positive definite matrices for elliptically contoured distributions.	NIPS/NeurIPS	2013	28
Jensen-Bregman LogDet Divergence with Application to Efficient Similarity Search for Covariance Matrices.	TPAMI	2013	157
Reflection methods for user-friendly submodular optimization.	NIPS/NeurIPS	2013	76
Fast projections onto mixed-norm balls with applications.	DMKD	2012	28
Scalable nonconvex inexact proximal splitting.	NIPS/NeurIPS	2012	64
A new metric on the manifold of kernel matrices with application to matrix geometric means.	NIPS/NeurIPS	2012	133
Fast Newton-type Methods for Total Variation Regularization.	ICML	2011	87
Fast Projections onto ℓ1,q-Norm Balls for Grouped Feature Selection.	ECML/PKDD	2011	39
Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval.	ECML/PKDD	2011	50
Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet Divergence.	ICCV	2011	77
A scalable trust-region algorithm with application to mixed-norm regression.	ICML	2010	41
Efficient filter flow for space-variant multiframe blind deconvolution.	CVPR	2010	232
Convex Perturbations for Scalable Semidefinite Programming.	AISTATS	2009	9
Workshop summary: Numerical mathematics in machine learning.	ICML	2009	0
Block-Iterative Algorithms for Non-negative Matrix Approximation.	ICDM	2008	6
Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem.	SDM	2007	139
Information-theoretic metric learning.	ICML	2007	0
Incremental Aspect Models for Mining Document Streams.	ECML/PKDD	2006	18
Efficient Large Scale Linear Programming Support Vector Machines.	ECML/PKDD	2006	19
Generalized Nonnegative Matrix Approximations with Bregman Divergences.	NIPS/NeurIPS	2005	479
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions.	JMLR	2005	874
Triangle Fixing Algorithms for the Metric Nearness Problem.	NIPS/NeurIPS	2004	19
Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data.	SDM	2004	327
Generative model-based clustering of directional data.	KDD	2003	122